A Smoothed Approximate Linear Program
نویسندگان
چکیده
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP naturally restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program – the ‘smoothed approximate linear program’ – relaxes this restriction in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: First, we demonstrate superior bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. Second, experiments with our approach on a challenging problem (the game of Tetris) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by an order of magnitude.
منابع مشابه
The Smoothed Approximate Linear Program
We present a novel linear program for the approximation of the dynamic programming costto-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural ‘projection’ of a well studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go fun...
متن کاملApproximate Dynamic Programming via a Smoothed Linear Program
We present a novel linear program for the approximation of the dynamic programming costto-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural ‘projection’ of a well studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go fun...
متن کاملDecision Making Based on Approximate and Smoothed Pareto Curves
We consider bicriteria optimization problems and investigate the relationship between two standard approaches to solving them: (i) computing the Pareto curve and (ii) the so-called decision maker’s approach in which both criteria are combined into a single (usually non-linear) objective function. Previous work by Papadimitriou and Yannakakis showed how to efficiently approximate the Pareto curv...
متن کاملSmoothed analysis of termination of linear programming algorithms
We perform a smoothed analysis of a termination phase for linear programming algorithms. By combining this analysis with the smoothed analysis of Renegar’s condition number by Dunagan, Spielman and Teng (http://arxiv.org/abs/cs.DS/0302011) we show that the smoothed complexity of interior-point algorithms for linear programming isO(m3 log(m/σ)). In contrast, the best known bound on the worst-cas...
متن کاملValue Function Approximation in Noisy Environments Using Locally Smoothed Regularized Approximate Linear Programs
Recently, Petrik et al. demonstrated that L1Regularized Approximate Linear Programming (RALP) could produce value functions and policies which compared favorably to established linear value function approximation techniques like LSPI. RALP’s success primarily stems from the ability to solve the feature selection and value function approximation steps simultaneously. RALP’s performance guarantee...
متن کامل